Variable Support Mining of Frequent Itemsets over Data Streams Using Synopsis Vectors

Authors

  • Ming-Yen Lin
  • Sue-Chen Hsueh
  • Sheng-Kun Hwang
Abstract

Mining frequent itemsets over data streams has emerged as an active research topic in recent years. Previous approaches generally use a fixed support threshold to discover patterns in the stream. In practice, however, the threshold changes to meet the needs of users and the characteristics of the incoming data. In a non-streaming environment, changing the threshold means re-mining all of the transactions. A data stream, however, can be examined only once (the "look-once" property), so discarded transactions are no longer available and re-mining the stream is impossible. We therefore propose a method for variable support mining of frequent itemsets over a data stream. A synopsis vector is constructed to maintain statistics of past transactions and is invoked only when necessary. Experimental results show that our approach is efficient and scalable for variable support mining in data streams.
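To make the problem setting concrete, the following is a minimal sketch of answering frequent-itemset queries at different support thresholds after each transaction has been seen only once. It is not the synopsis vector of the paper, whose construction the abstract does not describe. The class name `StreamItemsetCounter`, its `max_size` parameter, and the choice to keep exact counts of small itemsets are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


class StreamItemsetCounter:
    """Hypothetical summary of a stream: exact counts of small itemsets.

    This is an illustrative stand-in, not the paper's synopsis vector.
    """

    def __init__(self, max_size=2):
        self.max_size = max_size    # largest itemset size tracked (assumption)
        self.counts = Counter()     # sorted tuple of items -> occurrence count
        self.n_transactions = 0     # number of transactions seen so far

    def add(self, transaction):
        """Process one transaction once; the caller may then discard it."""
        items = sorted(set(transaction))
        self.n_transactions += 1
        for k in range(1, min(self.max_size, len(items)) + 1):
            for itemset in combinations(items, k):
                self.counts[itemset] += 1

    def frequent(self, min_support):
        """Return tracked itemsets with relative support >= min_support."""
        if self.n_transactions == 0:
            return {}
        threshold = min_support * self.n_transactions
        return {s: c for s, c in self.counts.items() if c >= threshold}
```

After feeding the transactions `["a", "b", "c"]`, `["a", "b"]`, `["a", "c"]`, and `["b"]` through `add`, calling `frequent(0.75)` and then `frequent(0.5)` answers both thresholds from the stored counts without revisiting the stream. Keeping exact counts grows quickly with `max_size`, which is the kind of cost a compact synopsis is meant to avoid.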

Similar resources

Interactive Mining of Frequent Itemsets over Arbitrary Time Intervals in a Data Stream

Mining frequent patterns in a data stream is very challenging because of the high complexity of managing patterns in bounded memory against unbounded data. While many approaches assume a fixed support threshold, a changeable threshold is more realistic, considering the rapid updating of streaming transactions in practice. Additionally, mining of itemsets over various time granularities rathe...

Mining Frequent Itemsets with Normalized Weight in Continuous Data Streams

A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. The continuous characteristic of streaming data necessitates the use of algorithms that require only one scan over the stream for knowledge discovery. Data mining over data streams should support the flexible trade-off between processing time and mining accuracy. In many application areas, min...

CLAIM: An Efficient Method for Relaxed Frequent Closed Itemsets Mining over Stream Data

Recently, frequent itemset mining over data streams has attracted much attention. However, mining closed itemsets from data streams has not been well addressed. The main difficulty lies in the high maintenance complexity caused by the exact model definition of closed itemsets and the dynamic nature of data streams. In a data stream scenario, it is sufficient to mine only approximated frequent...

Mining Closed Itemsets in Data Stream Using Formal Concept Analysis

Mining of frequent closed itemsets has been shown to be more efficient than mining frequent itemsets for generating non-redundant association rules. The task is challenging in a data stream environment because of the stream's unbounded nature and no-second-look characteristics. In this paper, we propose an algorithm, CLICI, for mining all recent closed itemsets in the landmark window model of online data strea...

An Efficient Algorithm for Mining Frequent Itemsets Within Large Windows Over Data Streams

The sliding window is an interesting model for frequent pattern mining over data streams because it handles concept change by considering recent data. In this study, a novel approximate algorithm for frequent itemset mining is proposed that operates in both the transactional and the time-sensitive sliding window models. This algorithm divides the current window into a set of partitions and estimates the suppor...

Publication date: 2006